Abstract: The problem of preserving privacy when a multivariate source must be partially revealed to multiple users is modeled as a Gray-Wyner source coding problem with $K$ correlated sources at the encoder and $K$ decoders, in which the $k$th decoder, $k = 1, 2, \ldots, K$, losslessly reconstructs the $k$th source via a common link of rate $R_0$ and a private link of rate $R_k$. The privacy requirement of keeping each decoder oblivious of all sources other than the one intended for it is introduced via an equivocation constraint $E_k$ at decoder $k$ such that the total equivocation summed over all decoders satisfies $E \ge \Delta$. The set of achievable $(\{R_k\}_{k=1}^K, R_0, \Delta)$ rates-equivocation $(K+2)$-tuples is completely characterized. Using this characterization, two different definitions of common information are presented and are shown to be equivalent.

I. Introduction 

Information sources often need to be made accessible to 
multiple legitimate users simultaneously. However, not all 
data from the source should be accessible to all users. For 
example, a computer retailer may need to share the annual 
revenue of all computers sold with all the vendors but share 
vendor-specific sale information only with a particular vendor. 
Similarly, a business consulting firm may share general data 
about a specific market with all clients associated with that 
market but share client-specific strategies with only that client. 
In both cases, one can view sharing the public (shared by all) 
information via a common link and the private information 
via a dedicated link. Maximizing the rate over the common 
link allows the information source (retailer/consulting firm) to 
share the most allowed publicly with all clients; however, the 
privacy guarantee requires that no client has access to private 
data of the other clients. This paper develops an abstract model 
and a methodology to study this problem. 

We model the problem of revealing partial source information to multiple users while keeping the data specific to each user private from other users as a Gray-Wyner source coding problem with $K$ correlated sources at the encoder and $K$ decoders, in which the $k$th decoder, $k = 1, 2, \ldots, K$, losslessly reconstructs the $k$th source via a common link of rate $R_0$ and a private link of rate $R_k$. We model the privacy requirement of keeping each decoder oblivious of all sources other than the one intended for it via an equivocation constraint $E_k$ at decoder $k$ such that the total equivocation summed over all decoders satisfies $E \ge \Delta$.

Since privacy is an important aspect of this problem, it is 
natural to understand the maximal total equivocation that is 
achievable if the rate on the common link is set to the maxi- 
mum achievable. On the other hand, imposing the constraint 
of maximal total equivocation may lead to perhaps a different 
limit on the maximal rate on the common link. In this paper, 
we show that both requirements, which are formally different 
definitions, yield the same formulation for the maximal rate on 
the common link. In keeping with the literature, this common 
rate is defined as the common information. 

The common information of two correlated random variables has been defined independently by Wyner [1] and Gács-Körner [2]. Wyner's definition of common information as applied to the two-user Gray-Wyner system (without privacy constraints) is the minimum rate on the common link such that the total information shared across all three links (one common and two private) does not exceed the source entropy. On the other hand, the Gács-Körner common information is the maximal entropy of a random variable that two non-interacting terminals can agree upon when one terminal has access to $X^n$ and the other to $Y^n$, where $X$ and $Y$ are correlated random variables. For two correlated variables $X$ and $Y$, the Wyner common information $C_W$, the Gács-Körner common information $C_{GK}$, and the mutual information of the two variables are related as $C_{GK} \le I(X;Y) \le C_W$. Recently, the authors in [3] have generalized Wyner's definition of common information to $K$ variables, henceforth referred to as $B(X_1, X_2, \ldots, X_K)$ for $K$ correlated variables. While the definition naturally generalizes the two-variable common information, the resulting common information does not satisfy a non-increasing property with $K$ as expected.
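The ordering $C_{GK} \le I(X;Y)$ can be checked numerically on small alphabets. The sketch below is not from the paper: it assumes the standard characterization of the Gács-Körner common information as the entropy of the "common part" of $(X, Y)$, that is, the index of the connected component of the bipartite graph whose edges are the pairs $(x, y)$ with $p(x, y) > 0$. The function names `mutual_information` and `gacs_korner` are hypothetical.

```python
import math
from collections import defaultdict

def mutual_information(pxy):
    """I(X;Y) in bits for a joint pmf given as {(x, y): prob}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in pxy.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def gacs_korner(pxy):
    """Entropy (bits) of the connected-component index of the support graph."""
    parent = {}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Join x and y whenever the pair (x, y) has positive probability.
    for (x, y), p in pxy.items():
        if p > 0:
            a, b = ("x", x), ("y", y)
            parent.setdefault(a, a)
            parent.setdefault(b, b)
            parent[find(a)] = find(b)

    # Probability mass of each connected component.
    mass = defaultdict(float)
    for (x, y), p in pxy.items():
        if p > 0:
            mass[find(("x", x))] += p
    return -sum(p * math.log2(p) for p in mass.values())

# Doubly symmetric binary source: full support, so the common part is trivial.
dsbs = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
# Block-structured pmf: two components of mass 1/2 each.
block = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.1, (2, 2): 0.5}
for name, pmf in (("dsbs", dsbs), ("block", block)):
    print(name, gacs_korner(pmf), mutual_information(pmf))
```

For the full-support source the common part carries no information even though the mutual information is positive, which is the gap between $C_{GK}$ and $I(X;Y)$ that the relation above allows.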

In this paper, we present two different definitions of common information: the first is the maximal rate on the common link for which the total equivocation is maximized, and the second is the maximal rate on the common link such that each user losslessly reconstructs its intended source at its entropy. We show that both definitions lead to the same formulation for the common information $C(X_1, X_2, \ldots, X_K)$. We present several properties of $C(X_1, X_2, \ldots, X_K)$ and specifically show that $C(X_1, X_2, \ldots, X_K) \le B(X_1, X_2, \ldots, X_K)$. To the best of our knowledge, this is the first generalization of common information that preserves the non-increasing property and one whose form can be viewed as a natural generalization of the Gács-Körner common information to $K$ variables.

Fig. 1. The generalized Gray-Wyner source network.

The paper is organized as follows. In Section II, we present the system model. In Section III, we present the rate-equivocation region, develop a formulation for common information in two different ways, and present key properties. In Section IV, we compare our formulation with the $K$-variable generalization of Wyner's common information in [3] and illustrate with examples. We conclude in Section V.

II. System Model 

We consider the following source network. A centralized encoder observes $K$ discrete, memoryless, correlated sources $\{X_k\}_{k=1}^K$ and is interested in communicating source $X_k$ to decoder $k$ in a lossless manner. The resources available at the encoder comprise two types of noiseless rate-limited links: $K$ private links of finite rate, one from the encoder to each of the $K$ decoders, and a common link of finite rate to all decoders. Figure 1 shows the source broadcasting network under consideration.

An $(n, \{M_k\}_{k=1}^K, M_0)$ code for this model is defined by $(K+1)$ encoding functions described as

\begin{align}
f_0 &: \mathcal{X}_1^n \times \cdots \times \mathcal{X}_K^n \to \{1, \ldots, M_0\}, \tag{1} \\
f_k &: \mathcal{X}_1^n \times \cdots \times \mathcal{X}_K^n \to \{1, \ldots, M_k\}, \quad k = 1, \ldots, K, \tag{2}
\end{align}

and $K$ decoding functions,

$$g_k : \{1, \ldots, M_0\} \times \{1, \ldots, M_k\} \to \mathcal{X}_k^n, \quad k = 1, \ldots, K.$$

We define the probability of error at decoder $k$ as

$$P_{e,k} = \Pr\left(X_k^n \neq g_k(f_0(\mathbf{X}^n), f_k(\mathbf{X}^n))\right),$$

where $\mathbf{X} = \{X_k\}_{k=1}^K$. We define the equivocation at decoder $k$ as

$$E_k = \frac{1}{n} H(\mathbf{X}^n \setminus X_k^n \mid f_0(\mathbf{X}^n), f_k(\mathbf{X}^n)),$$

and the total equivocation as $E = \sum_{k=1}^K E_k$.

Remark 1: Informally, $E_k$ captures the average uncertainty, and hence privacy achievable, about the remaining $(K - 1)$ unintended sources at decoder $k$.

A $(\{R_k\}_{k=1}^K, R_0, \Delta)$ rate-equivocation $(K+2)$-tuple is achievable for the source network if there exists an $(n, \{M_k\}_{k=1}^K, M_0)$ code such that

\begin{align}
M_0 &\le 2^{nR_0}, \tag{3} \\
M_k &\le 2^{nR_k}, \quad k = 1, \ldots, K, \tag{4} \\
P_{e,k} &\le \epsilon, \quad k = 1, \ldots, K, \tag{5} \\
E &\ge \Delta - \epsilon. \tag{6}
\end{align}

We denote by $\mathcal{R}$ the region of all achievable $(\{R_k\}_{k=1}^K, R_0, \Delta)$ rate-equivocation $(K+2)$-tuples.

III. Main Contributions 
A. Rate-Equivocation Region

We state our first result in the following theorem. The proof 
is presented in the appendix. 

Theorem 1: The region $\mathcal{R}$ of achievable rates-equivocation $(K+2)$-tuples for the source network shown in Figure 1 is the union of all $(K+2)$-tuples $(\{R_k\}_{k=1}^K, R_0, \Delta)$ that satisfy

\begin{align}
R_0 &\ge I(X_1, X_2, \ldots, X_K; W), \tag{7} \\
R_k &\ge H(X_k \mid W), \quad k = 1, 2, \ldots, K, \tag{8} \\
\Delta &\le \sum_{k=1}^K H(\mathbf{X} \mid W, X_k), \tag{9}
\end{align}

where the union is over all auxiliary random variables $W$ arbitrarily correlated with $(X_1, X_2, \ldots, X_K)$, and where $\mathbf{X} = (X_1, X_2, \ldots, X_K)$.
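For a fixed choice of the auxiliary variable $W$, the three bounds (7)-(9) are plain entropy expressions and can be evaluated directly from a joint pmf. The following is a minimal sketch under stated assumptions, not part of the paper: the joint pmf is a Python dictionary mapping tuples $(x_1, \ldots, x_K, w)$ to probabilities, and the helper names `entropy`, `marginal`, and `region_bounds` are hypothetical. The toy source at the end (two sources sharing one fair bit, with $W$ equal to that bit) is also an assumption chosen for illustration.

```python
import math
from collections import defaultdict
from itertools import product

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginalize a pmf {outcome_tuple: prob} onto the coordinates in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out

def region_bounds(joint, K):
    """Evaluate the right-hand sides of (7), (8), (9) for one choice of W.

    `joint` is a pmf over tuples (x_1, ..., x_K, w).
    """
    xs, w = list(range(K)), K
    H = lambda idx: entropy(marginal(joint, idx))
    r0 = H(xs) + H([w]) - H(xs + [w])                  # I(X_1, ..., X_K; W)
    rk = [H([k, w]) - H([w]) for k in xs]              # H(X_k | W)
    delta = sum(H(xs + [w]) - H([k, w]) for k in xs)   # sum_k H(X | W, X_k)
    return r0, rk, delta

# Toy source: X_1 = (X_0, A), X_2 = (X_0, B) with fair independent bits, W = X_0.
joint = {((x0, a), (x0, b), x0): 1 / 8 for x0, a, b in product((0, 1), repeat=3)}
r0, rk, delta = region_bounds(joint, K=2)
print(r0, rk, delta)
```

Sweeping over conditional distributions $p(w \mid x_1, \ldots, x_K)$ and taking the union of the resulting corner points would trace out the region; the sketch evaluates only a single point.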

Remark 2: The rate region $\mathcal{R}_{GW}$ of the Gray-Wyner network without additional equivocation constraints is the region of $(K+1)$ rate tuples that satisfy (7) and (8).

B. Common Information of K Correlated Variables 

We now present two definitions for the common information 
of K correlated random variables. 

Definition 1: The common information of $K$ correlated random variables, $C_1$, is the maximal value of $R_0$ such that $(\{R_k\}_{k=1}^K, R_0, \Delta_{\max}) \in \mathcal{R}$, where

$$\Delta_{\max} = \sum_{k=1}^K H(\mathbf{X} \mid X_k).$$

Definition 2: The common information of $K$ correlated random variables, $C_2$, is the maximal value of $R_0$ such that $(\{H(X_k) - R_0\}_{k=1}^K, R_0) \in \mathcal{R}_{GW}$.

We next state our second result. 

Theorem 2: $C_1$ and $C_2$ are related as follows:

$$C_1 = C_2 = \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, 2, \ldots, K} I(X_1, X_2, \ldots, X_K; W). \tag{10}$$

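Proof: From Definition 1, the achievable equivocation $E$ must satisfy

$$E \ge \Delta_{\max} = \sum_{k=1}^K H(\mathbf{X} \mid X_k).$$

On the other hand, any achievable $(\{R_k\}_{k=1}^K, R_0, E) \in \mathcal{R}$ also satisfies

$$E \le \sum_{k=1}^K H(\mathbf{X} \mid W, X_k).$$

We therefore have the following constraint:

$$\sum_{k=1}^K H(\mathbf{X} \mid W, X_k) \ge \sum_{k=1}^K H(\mathbf{X} \mid X_k).$$

Since conditioning cannot increase entropy, each term on the left is at most the corresponding term on the right, so the constraint holds only with equality in every term. It is therefore equivalent to the following $K$ constraints:

$$I(\mathbf{X} \setminus X_k; W \mid X_k) = 0, \quad k = 1, \ldots, K. \tag{11}$$

Therefore, from Definition 1, $C_1$ is equal to the maximal $R_0$ subject to (11), which implies that

$$C_1 = \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, \ldots, K} I(X_1, \ldots, X_K; W).$$

From Definition 2, $C_2$ is defined as the maximal $R_0$ such that $R_k + R_0 = H(X_k)$, for $k = 1, \ldots, K$, and $(\{R_k\}_{k=1}^K, R_0) \in \mathcal{R}_{GW}$. We therefore have the following constraints for $k = 1, \ldots, K$:

\begin{align}
H(X_k) &= R_k + R_0 \tag{12} \\
&\ge H(X_k \mid W) + I(X_1, \ldots, X_K; W). \tag{13}
\end{align}

These constraints are equivalent to

$$I(\mathbf{X} \setminus X_k; W \mid X_k) = 0, \quad k = 1, \ldots, K.$$

Therefore, $C_2$ can be written as follows:

$$C_2 = \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, \ldots, K} I(X_1, \ldots, X_K; W).$$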
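Both equivalences used in the proof follow from the chain rule for mutual information. The sketch below spells out the algebra; it uses only the quantities already defined, with $\mathbf{X} \setminus X_k$ denoting all sources except $X_k$.

```latex
\begin{align*}
H(\mathbf{X} \mid X_k) - H(\mathbf{X} \mid W, X_k)
  &= I(\mathbf{X}; W \mid X_k)
   = I(\mathbf{X} \setminus X_k; W \mid X_k) \ge 0, \\
H(X_k \mid W) + I(X_1, \ldots, X_K; W)
  &= \bigl[H(X_k) - I(X_k; W)\bigr]
   + \bigl[I(X_k; W) + I(\mathbf{X} \setminus X_k; W \mid X_k)\bigr] \\
  &= H(X_k) + I(\mathbf{X} \setminus X_k; W \mid X_k).
\end{align*}
```

The first line shows that the summed equivocation constraint can hold only if every conditional mutual information vanishes. The second shows that the right-hand side of (13) exceeds $H(X_k)$ by exactly $I(\mathbf{X} \setminus X_k; W \mid X_k)$, so (12)-(13) can hold only if that term is zero.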



C. Common Information: Properties 

We will now develop some properties of the common information of $K$ correlated random variables defined in Theorem 2.

Proposition 1: The common information of $K$ random variables, $C(X_1, X_2, \ldots, X_K)$, is monotonically decreasing in $K$.

Proof: Consider an arbitrary $W$ satisfying the Markov chain relationship

$$W - X_k - \mathbf{X} \setminus X_k, \quad k = 1, \ldots, K. \tag{14}$$

First consider the following sequence of inequalities:

\begin{align}
I(X_1, \ldots, X_{K-1}, X_K; W)
&= I(X_1, \ldots, X_{K-1}; W) + I(X_K; W \mid X_1, \ldots, X_{K-1}) \tag{15} \\
&\le I(X_1, \ldots, X_{K-1}; W) + I(X_2, \ldots, X_K; W \mid X_1) \tag{16} \\
&= I(X_1, \ldots, X_{K-1}; W), \tag{17}
\end{align}

where (17) follows from the Markov chain relationship $W - X_1 - (X_2, \ldots, X_K)$. Now consider the following sequence of inequalities:

\begin{align}
C(X_1, \ldots, X_K)
&= \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, \ldots, K} I(X_1, \ldots, X_K; W) \tag{18} \\
&\le \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, \ldots, K} I(X_1, \ldots, X_{K-1}; W) \tag{19} \\
&\le \max_{W:\, W - X_k - \mathbf{X} \setminus (X_k, X_K),\ k = 1, \ldots, (K-1)} I(X_1, \ldots, X_{K-1}; W) \tag{20} \\
&= C(X_1, \ldots, X_{K-1}), \tag{21}
\end{align}

where (19) follows from (17), and (20) follows from the fact that the Markov chain relationship $W - X_k - \mathbf{X} \setminus X_k$ implies the Markov chain relationship $W - X_k - \mathbf{X} \setminus (X_k, X_K)$. Since the random variable $X_K$ could be chosen arbitrarily from the set $(X_1, \ldots, X_K)$, (21) shows that the common information is monotonically decreasing in $K$. ■
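Step (16) is stated without justification. One way to see it, using only the chain rule and the non-negativity of conditional mutual information, is the following sketch (for $K = 2$ the first term on the right is empty and equals zero):

```latex
\begin{align*}
I(X_2, \ldots, X_K; W \mid X_1)
  &= I(X_2, \ldots, X_{K-1}; W \mid X_1)
   + I(X_K; W \mid X_1, \ldots, X_{K-1}) \\
  &\ge I(X_K; W \mid X_1, \ldots, X_{K-1}).
\end{align*}
```

The Markov chain $W - X_1 - (X_2, \ldots, X_K)$ then makes the left-hand side zero, which gives (17).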
Proposition 2: $C(X_1, X_2, \ldots, X_K)$ is upper bounded as

$$C(X_1, X_2, \ldots, X_K) \le \min_{i \ne j,\ i, j = 1, 2, \ldots, K} I(X_i; X_j). \tag{22}$$

Proof: We consider an arbitrary $W$ satisfying (14), and upper bound the following mutual information:

\begin{align}
I(X_1, \ldots, X_K; W)
&= I(X_i; W) + I(\mathbf{X} \setminus X_i; W \mid X_i) \tag{23} \\
&= I(X_i; W) \tag{24} \\
&\le I(X_i; X_j, W) \tag{25} \\
&= I(X_i; X_j) + I(X_i; W \mid X_j) \tag{26} \\
&= I(X_i; X_j), \tag{27}
\end{align}

where (24) follows from the Markov chain condition $W - X_i - \mathbf{X} \setminus X_i$, and (27) follows from the Markov chain condition $W - X_j - X_i$. The choice of $(i, j)$ was arbitrary, and therefore, the common information is upper bounded by the minimum of pairwise mutual information among all pairs, i.e.,

$$C(X_1, \ldots, X_K) \le \min_{i \ne j} I(X_i; X_j).$$



IV. Comparison and Examples 

In [1], Wyner defines the common information of two correlated random variables $(X_1, X_2)$ as

$$B(X_1, X_2) = \inf_{X_1 \to W \to X_2} I(X_1, X_2; W).$$

One interpretation of this common information can be obtained from the Gray-Wyner source network. The common information $B(X_1, X_2)$ of two random variables is given as the smallest value of $R_0$ such that $(R_1, R_2, R_0) \in \mathcal{R}_{GW}$ and $R_0 + R_1 + R_2 \le H(X_1, X_2)$. Recently, this notion of common information was generalized to $K$ correlated random variables in [3]. The common information $B(X_1, \ldots, X_K)$ of $K$ correlated random variables, as defined in [3], is given by the smallest value of $R_0$ such that $(\{R_k\}_{k=1}^K, R_0) \in \mathcal{R}_{GW}$ and $R_0 + \sum_{k=1}^K R_k \le H(X_1, \ldots, X_K)$. The common information $B(X_1, \ldots, X_K)$ is given as

$$B(X_1, \ldots, X_K) = \inf I(X_1, \ldots, X_K; W),$$

where the infimum is over all distributions $p(w, x_1, \ldots, x_K)$ that satisfy

\begin{align}
\sum_{w \in \mathcal{W}} p(w, x_1, \ldots, x_K) &= p(x_1, \ldots, x_K), \tag{28} \\
p(x_1, \ldots, x_K \mid w) &= \prod_{k=1}^K p(x_k \mid w). \tag{29}
\end{align}

It was shown in [3] that $B(X_1, \ldots, X_K)$ is monotonically increasing in $K$. We believe that any intuitively satisfactory measure of common information should satisfy the property that the common information should decrease as the number of random variables increases. In Proposition 1, we showed that our measure of common information indeed satisfies this property.

We next prove a property of $B(X_1, \ldots, X_K)$ that helps us in comparing it with our common information $C(X_1, \ldots, X_K)$.

Proposition 3: $B(X_1, X_2, \ldots, X_K)$ is lower bounded as follows:

$$\max_{i \ne j} I(X_i; X_j) \le B(X_1, X_2, \ldots, X_K). \tag{30}$$

Proof: To prove Proposition 3, consider an arbitrary $W$ satisfying the constraints (28)-(29) and the following sequence of inequalities:

\begin{align}
I(X_1, \ldots, X_K; W) &\ge I(X_i; W) \tag{31} \\
&\ge I(X_i; X_j), \tag{32}
\end{align}

where (32) follows from the Markov chain relationship $X_i - W - X_j$ and from the data processing inequality. In arriving at (32), the choice of $(i, j)$ was arbitrary, and therefore we can maximize over all pairs $(i, j)$ such that $i \ne j$ to get the best possible lower bound in this manner. ■
Using Propositions 2 and 3, we have the following:

$$C(X_1, \ldots, X_K) \le \min_{i \ne j} I(X_i; X_j) \le \max_{i \ne j} I(X_i; X_j) \le B(X_1, \ldots, X_K). \tag{33}$$

We will now give two examples to illustrate the usefulness of our definition $C(X_1, \ldots, X_K)$ over $B(X_1, \ldots, X_K)$.

Example 1: Consider $K = 3$ random variables $(X_1, X_2, X_3)$ such that $X_1 \sim \mathrm{Ber}(1/2)$, $X_2 = X_1 \oplus N$, where $N \sim \mathrm{Ber}(\delta)$, and $X_3$ is independent of $(X_1, X_2)$. Since $X_3$ is independent of $(X_1, X_2)$, these sources have nothing in common and we should expect the 'common information' to be zero. Note that for these sources, $\min_{i \ne j} I(X_i; X_j) = 0$, whereas $\max_{i \ne j} I(X_i; X_j) = 1 - h(\delta)$. Therefore, from (33), we have

$$0 \le C(X_1, X_2, X_3) \le 0 < 1 - h(\delta) \le B(X_1, X_2, X_3),$$

which implies that $C(X_1, X_2, X_3) = 0$, whereas $B(X_1, X_2, X_3) > 0$ for any $\delta \in (0, 1/2)$.
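The pairwise mutual informations quoted in Example 1 can be reproduced numerically. The sketch below makes two assumptions that the example leaves implicit: $N$ is independent of $X_1$, and $X_3 \sim \mathrm{Ber}(q)$ with a hypothetical default $q = 1/2$ (any distribution for $X_3$ gives the same conclusion). The function names are illustrative only.

```python
import math
from collections import defaultdict
from itertools import product

def binary_entropy(d):
    """h(d) in bits, with h(0) = h(1) = 0."""
    if d in (0, 1):
        return 0.0
    return -d * math.log2(d) - (1 - d) * math.log2(1 - d)

def mutual_information(joint, i, j):
    """I(X_i; X_j) in bits for a pmf given as {outcome_tuple: prob}."""
    pi, pj, pij = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        pi[outcome[i]] += p
        pj[outcome[j]] += p
        pij[(outcome[i], outcome[j])] += p
    return sum(p * math.log2(p / (pi[a] * pj[b]))
               for (a, b), p in pij.items() if p > 0)

def example1_joint(delta, q=0.5):
    """Joint pmf of (X_1, X_2, X_3) with X_2 = X_1 xor N (assumed N independent of X_1)."""
    joint = defaultdict(float)
    for x1, n, x3 in product((0, 1), repeat=3):
        p = 0.5 * (delta if n else 1 - delta) * (q if x3 else 1 - q)
        joint[(x1, x1 ^ n, x3)] += p
    return joint

delta = 0.1
joint = example1_joint(delta)
mi = {(i, j): mutual_information(joint, i, j)
      for i in range(3) for j in range(i + 1, 3)}
print("min:", min(mi.values()))
print("max:", max(mi.values()), "vs 1 - h(delta):", 1 - binary_entropy(delta))
```

The minimum is attained by the two pairs involving $X_3$, and the maximum by the pair $(X_1, X_2)$, which is the gap that (33) exploits in this example.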

Example 2: Consider $K = 3$ random variables $(X_1, X_2, X_3)$ such that $X_1 = (X_0, X_{1p})$, $X_2 = (X_0, X_{2p})$, and $X_3 = (X_0, X_{3p})$, where $(X_0, X_{1p}, X_{2p}, X_{3p})$ are all mutually independent. Since $X_0$ appears to be the only common part in all three sources, we should expect the 'common information' to be equal to the entropy of $X_0$. Note that for these sources, $\min_{i \ne j} I(X_i; X_j) = \max_{i \ne j} I(X_i; X_j) = H(X_0)$. Therefore, from (33), we have

$$0 \le C(X_1, X_2, X_3) \le H(X_0) \le B(X_1, X_2, X_3).$$

It is straightforward to show that for these sources,

$$C(X_1, X_2, X_3) = B(X_1, X_2, X_3) = H(X_0).$$
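One half of the claim in Example 2 can be checked numerically: the choice $W = X_0$ satisfies the Markov constraints of Theorem 2 and achieves $I(X_1, X_2, X_3; W) = H(X_0)$, which together with the upper bound in (33) gives $C = H(X_0)$. The sketch below assumes, purely for illustration, that $X_0, X_{1p}, X_{2p}, X_{3p}$ are fair bits; the helper `H` is hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idx):
    """Entropy in bits of the coordinates `idx` of a pmf {outcome_tuple: prob}."""
    m = defaultdict(float)
    for outcome, p in joint.items():
        m[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in m.values() if p > 0)

# Coordinates 0, 1, 2 hold X_1, X_2, X_3; coordinate 3 holds the candidate W = X_0.
joint = {}
for x0, a, b, c in product((0, 1), repeat=4):
    joint[((x0, a), (x0, b), (x0, c), x0)] = 1 / 16

X, W = [0, 1, 2], 3

# Pairwise mutual informations I(X_i; X_j) = H(X_i) + H(X_j) - H(X_i, X_j).
pairwise = {(i, j): H(joint, [i]) + H(joint, [j]) - H(joint, [i, j])
            for i in X for j in X if i < j}

# I(X \ X_k; W | X_k) = H(X) + H(X_k, W) - H(X_k) - H(X, W); zero means the
# Markov chain W - X_k - X \ X_k holds.
markov_gap = [H(joint, X) + H(joint, [k, W]) - H(joint, [k]) - H(joint, X + [W])
              for k in X]

# Common-link rate achieved by this W: I(X_1, X_2, X_3; W).
rate = H(joint, X) + H(joint, [W]) - H(joint, X + [W])
print(pairwise, markov_gap, rate)
```

The check covers only achievability of $H(X_0)$ for $C$; the matching statement for $B$ is not verified here.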

Inspired by the above example, we show the following interesting property that in some sense relates $C(X_1, \ldots, X_K)$ to $B(X_1, \ldots, X_K)$.

Proposition 4: For a set of sources $X_1, X_2, \ldots, X_K$ that satisfy

$$\min_{i \ne j} I(X_i; X_j) = \max_{i \ne j} I(X_i; X_j), \tag{34}$$

we have

$$C(X_1, X_2, \ldots, X_K) = \min_{i \ne j} I(X_i; X_j) \tag{35}$$

if

$$B(X_1, X_2, \ldots, X_K) = \max_{i \ne j} I(X_i; X_j). \tag{36}$$

Proof: The constraint (34) implies that the mutual information $I(X_i; X_j)$ is the same for all $i, j \in \{1, \ldots, K\}$, $i \ne j$. Let us start with a $W^*$ that satisfies the infimization constraints for $B(X_1, \ldots, X_K)$ and yields

\begin{align}
B(X_1, \ldots, X_K) &= \max_{i \ne j} I(X_i; X_j) \tag{37} \\
&= I(X_{i_0}; X_{j_0}), \tag{38}
\end{align}

for some $i_0 \ne j_0$. For this $W^*$, we have

\begin{align}
I(X_{i_0}; X_{j_0}) &= \max_{i \ne j} I(X_i; X_j) \tag{39} \\
&= I(X_1, \ldots, X_K; W^*) \tag{40} \\
&= I(X_{i_0}; W^*) + I(\mathbf{X} \setminus X_{i_0}; W^* \mid X_{i_0}) \tag{41} \\
&\ge I(X_{i_0}; X_{j_0}) + I(\mathbf{X} \setminus X_{i_0}; W^* \mid X_{i_0}), \tag{42}
\end{align}

where (42) follows from the fact that $W^*$ satisfies the Markov relationship $X_{i_0} - W^* - X_{j_0}$, for all $i_0 \ne j_0$. In the derivation of (42), $i_0$ can be chosen arbitrarily due to (34). Therefore, (42) implies that this $W^*$ also satisfies

$$I(\mathbf{X} \setminus X_i; W^* \mid X_i) = 0$$

for all $i = 1, \ldots, K$. This in turn implies that $W^*$ serves as a valid choice in the maximization for evaluation of $C(X_1, \ldots, X_K)$. Therefore, we obtain the following lower bound for $C(X_1, \ldots, X_K)$:

\begin{align}
C(X_1, \ldots, X_K) &= \max_{W:\, W - X_k - \mathbf{X} \setminus X_k,\ k = 1, \ldots, K} I(X_1, \ldots, X_K; W) \tag{43} \\
&\ge I(X_1, \ldots, X_K; W^*) \tag{44} \\
&= \max_{i \ne j} I(X_i; X_j) \tag{45} \\
&= \min_{i \ne j} I(X_i; X_j). \tag{46}
\end{align}

Hence, from Proposition 2, it now follows that if $B(X_1, \ldots, X_K) = \max_{i \ne j} I(X_i; X_j)$, then $C(X_1, \ldots, X_K) = \min_{i \ne j} I(X_i; X_j)$. We remark here that a similar property has been shown for $K = 2$ by Ahlswede and Körner in [4]. ■



V. Concluding Remarks 



We have abstracted the problem of privacy in a setting where a source interacts with multiple users via the Gray-Wyner source coding problem with additional equivocation constraints at each user and a total equivocation constraint. In addition to developing the rate-equivocation region, we have introduced two definitions of common information of $K$ correlated variables and shown them both to have a form that can be viewed as a $K$-user generalization of the Gács-Körner common information (see also [4]).



VI. Appendix: Proof of Theorem 1

The converse follows by minor modifications of the converse proof for the unconstrained Gray-Wyner problem [5] and is therefore omitted. We now outline the proof of achievability for Theorem 1.

Codebook generation: Fix an input distribution $p(w \mid x_1, \ldots, x_K)$. Generate $2^{nI(X_1, \ldots, X_K; W)}$ sequences according to the distribution $\prod_{t=1}^n p(w_t)$, and index these sequences as $w^n(i)$, for $i = 1, \ldots, 2^{nI(X_1, \ldots, X_K; W)}$. Independently and uniformly bin the $X_k^n$-sequences in $2^{nH(X_k \mid W)}$ bins, and index these bins as $b_{k,1}, \ldots, b_{k, 2^{nH(X_k \mid W)}}$, for $k = 1, \ldots, K$.
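The binning step can be pictured with a toy sketch. The function below is not the paper's construction: it is a hypothetical helper, `random_binning`, that assigns every length-$n$ sequence over a small alphabet to one of roughly $2^{n \cdot \text{rate}}$ bins, independently and uniformly, with the bin count rounded up to an integer. It enumerates all sequences, so it is only practical for very small $n$.

```python
import math
import random
from itertools import product

def random_binning(alphabet, n, rate, seed=0):
    """Assign each sequence in alphabet^n to a uniformly chosen bin index.

    Returns (assignment, num_bins), where num_bins = ceil(2^(n * rate)).
    """
    rng = random.Random(seed)
    num_bins = max(1, math.ceil(2 ** (n * rate)))
    assignment = {seq: rng.randrange(num_bins)
                  for seq in product(alphabet, repeat=n)}
    return assignment, num_bins

# Toy instance: binary sequences of length 8 binned at rate 0.5 bits per symbol.
bins, num_bins = random_binning((0, 1), n=8, rate=0.5)
print(num_bins, len(bins))
```

In the scheme above, the rate would be $H(X_k \mid W)$ and the encoder would send only the bin index of $x_k^n$ on the private link to decoder $k$.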

Encoding scheme: Upon observing the $(x_1^n, \ldots, x_K^n)$ sequences, the encoder searches for a $w^n$ sequence that is jointly typical with these sequences. Using standard arguments (as in [6]), it can be shown that the encoder can succeed in finding one such $w^n$ sequence. The encoder sends the index of the $w^n$ sequence on the public link, for which we require $R_0 \ge I(X_1, \ldots, X_K; W)$. It sends the bin index of the source sequence $x_k^n$ on the private link to decoder $k$, for which we require $R_k \ge H(X_k \mid W)$.

Decoding: Decoder $k$ looks for a unique $x_k^n$ in bin $b_k$ (received from the private link) that is jointly typical with the $w^n$ sequence received from the public link. It can be shown that decoder $k$ can reconstruct $X_k^n$ with a vanishingly small probability of error. We omit the probability of error calculation as it follows from the same arguments as in [5].

Equivocation: We show that this coding scheme yields the total equivocation stated in Theorem 1. Let $J_0$ denote the encoder output for the public link and let $J_k$ denote the encoder output for the private link to decoder $k$, for $k = 1, \ldots, K$. For $E_k$, we have the following sequence of inequalities:

\begin{align}
E_k &= \frac{1}{n} H(X_1^n, \ldots, X_{k-1}^n, X_{k+1}^n, \ldots, X_K^n \mid J_0, J_k) \tag{47} \\
&= \frac{1}{n} H(\mathbf{X}^n \setminus X_k^n \mid J_0, J_k) \tag{48} \\
&\ge \frac{1}{n} H(\mathbf{X}^n \mid J_0, J_k) - \frac{1}{n} H(X_k^n \mid J_0, J_k) \tag{49} \\
&\ge \frac{1}{n} H(\mathbf{X}^n \mid J_0, J_k) - \epsilon_{k,n} \tag{50} \\
&= \frac{1}{n} H(\mathbf{X}^n, J_0, J_k) - \frac{1}{n} H(J_0, J_k) - \epsilon_{k,n} \tag{51} \\
&\ge \frac{1}{n} H(\mathbf{X}^n) - \frac{1}{n} H(J_0, J_k) - \epsilon_{k,n} \tag{52} \\
&\ge \frac{1}{n} H(\mathbf{X}^n) - \frac{1}{n} H(J_0) - \frac{1}{n} H(J_k) - \epsilon_{k,n} \tag{53} \\
&\ge H(X_1, \ldots, X_K) - I(X_1, \ldots, X_K; W) - H(X_k \mid W) - \epsilon_{k,n} \tag{54} \\
&= H(X_1, \ldots, X_K \mid W, X_k) - \epsilon_{k,n} \tag{55} \\
&= H(\mathbf{X} \mid W, X_k) - \epsilon_{k,n}, \tag{56}
\end{align}

where (50) follows from Fano's inequality, and (54) follows from the facts that $H(J_0) \le \log(|J_0|) = nI(X_1, \ldots, X_K; W)$ and $H(J_k) \le \log(|J_k|) = nH(X_k \mid W)$, for $k = 1, \ldots, K$. Therefore, we have that

$$E = \sum_{k=1}^K E_k \ge \sum_{k=1}^K H(\mathbf{X} \mid W, X_k) - \epsilon.$$

Hence, this coding scheme yields an equivocation of $\Delta = \sum_{k=1}^K H(\mathbf{X} \mid W, X_k)$.
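The passage from (54) to (55) uses only the definition of mutual information and the chain rule for entropy. A short sketch of the algebra, with the vanishing term $\epsilon_{k,n}$ dropped:

```latex
\begin{align*}
H(X_1, \ldots, X_K) - I(X_1, \ldots, X_K; W) - H(X_k \mid W)
  &= H(X_1, \ldots, X_K \mid W) - H(X_k \mid W) \\
  &= H(\mathbf{X} \setminus X_k \mid W, X_k) \\
  &= H(X_1, \ldots, X_K \mid W, X_k).
\end{align*}
```

The last equality holds because $X_k$ is already part of the conditioning and so contributes no additional uncertainty.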